Analysis of a Method Improving Reinforcement Learning Agents' Policies

Authors

  • Daisuke Kitakoshi
  • Hiroyuki Shioya
  • Masahito Kurihara

Abstract

Reinforcement learning (RL) is a form of machine learning that aims to optimize an agent's policy by adapting the agent to its environment according to rewards. In this paper, we propose a method for improving policies by using stochastic knowledge that reinforcement learning agents obtain. We use a Bayesian network (BN), a stochastic model, as the agent's knowledge. Its structure is determined by the minimum description length (MDL) criterion, using sequences of the agent's inputs, outputs, and rewards as sample data. The BN constructed in our study represents the stochastic dependences between input-output and rewards. In the proposed method, policies are improved by supervised learning that uses the structure of the BN (i.e., the stochastic knowledge). The proposed improvement mechanism enables RL agents to acquire more effective policies. We carry out simulations on the pursuit problem to show the effectiveness of the proposed method.
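
The abstract does not spell out the MDL score or the search procedure, so the following is only a minimal sketch of how a description-length criterion might pick the parents of the reward node from logged input-output-reward samples. It assumes the commonly used form of the score (negative log-likelihood plus a (log N)/2 penalty per free parameter), discrete variables, and an exhaustive search over small parent sets. The names `mdl_score`, `best_parents`, `ARITY`, and the toy `data` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations

def mdl_score(data, child, parents, arity):
    """Description length of column `child` given `parents` (lower is better).

    Assumed form: negative log-likelihood + 0.5 * (#free parameters) * log(N).
    """
    n = len(data)
    # Counts of (parent configuration, child value) and of parent configuration.
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    marg = Counter(tuple(row[p] for p in parents) for row in data)
    # Negative log-likelihood under maximum-likelihood conditional probabilities.
    nll = -sum(c * math.log(c / marg[pa]) for (pa, _), c in joint.items())
    # Free parameters of the conditional probability table.
    k = (arity[child] - 1) * math.prod(arity[p] for p in parents)
    return nll + 0.5 * k * math.log(n)

def best_parents(data, child, candidates, arity, max_parents=2):
    """Exhaustively search parent sets of size <= max_parents (an assumption)."""
    best = ((), mdl_score(data, child, (), arity))
    for size in range(1, max_parents + 1):
        for ps in combinations(candidates, size):
            score = mdl_score(data, child, ps, arity)
            if score < best[1]:
                best = (ps, score)
    return best

# Toy samples (hypothetical): columns are (input, output, reward), and the
# reward is 1 exactly when the agent's output matches its input.
ARITY = {0: 2, 1: 2, 2: 2}
data = [(s, a, int(s == a)) for s in (0, 1) for a in (0, 1) for _ in range(10)]

parents, score = best_parents(data, child=2, candidates=(0, 1), arity=ARITY)
print(parents, round(score, 2))
```

On this toy data the reward depends on the input and output jointly, so neither variable alone lowers the score, while the pair does. A learned structure of this kind is what the abstract calls stochastic knowledge; how it then drives the supervised policy update is not described in the abstract and is not sketched here.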

Similar resources

Combinando Modelos de Interação para Melhorar a Coordenação em Sistemas Multiagente

The main purpose of this paper is to implement a hybrid method of coordination from the combination of interaction models previously developed. The interaction models are based on rewards sharing for multi-agent learning, in order to discover interactively good action policies. The exchange of rewards among the agents during their interaction is a complex task and if it is not carried out prope...

Hierarchical Functional Concepts for Knowledge Transfer among Reinforcement Learning Agents

This article introduces the notions of functional space and concept as a way of knowledge representation and abstraction for Reinforcement Learning agents. These definitions are used as a tool of knowledge transfer among agents. The agents are assumed to be heterogeneous; they have different state spaces but share the same dynamics, reward, and action space. In other words, the agents are assumed t...

An Adaptive Architecture for Modular Q-Learning

Reinforcement learning is a technique to learn suitable action policies that maximize utility, via the clue of reinforcement signals: reward or punishment. Q-learning, a widely used reinforcement learning method, has been analyzed in much research on autonomous agents. However, as the size of the problem space increases, agents need more computational resources and require more time to learn ap...

Modular Learning Systems for Behavior Acquisition in Multi-Agent Environment

There has been a great deal of research on reinforcement learning in multirobot/agent environments during the last decades. A wide range of applications, such as forage robots (Mataric, 1997), soccer playing robots (Asada et al., 1996), prey-pursuing robots (Fujii et al., 1998) and so on, have been investigated. However, a straightforward application of the simple reinforcement learning method to ...

An Optimal Online Method of Selecting Source Policies for Reinforcement Learning

Transfer learning significantly accelerates the reinforcement learning process by exploiting relevant knowledge from previous experiences. The problem of optimally selecting source policies during the learning process is of great importance yet challenging. There has been little theoretical analysis of this problem. In this paper, we develop an optimal online method to select source policies fo...

Journal title:
  • JACIII

Volume 7

Publication date: 2003